Fast Convolutions and Their Applications in Approximate String Matching

Authors

  • Kimmo Fredriksson
  • Szymon Grabowski
Abstract

We develop a method for performing boolean convolutions efficiently in the word RAM model of computation with a word size of w = Ω(log n) bits, where n is the input size. The technique is applied to approximate string matching under Hamming distance. The resulting algorithms are the fastest known. In particular, we reduce the complexity of the Amir et al. [1] algorithm for k-mismatches from O(n√(k log k)) to O(n + n√(k/w) log k).
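To make the connection between boolean convolutions and Hamming-distance matching concrete, here is a minimal Python sketch. It is not the authors' algorithm and does not reach the bounds quoted in the abstract. It only illustrates the underlying idea: for each character, the text and the pattern are turned into bit vectors, and the number of matches at an alignment is the popcount of the AND of the shifted text vector with the pattern vector. Python integers stand in for packed machine words, and the function names `char_masks`, `hamming_distances`, and `k_mismatch_positions` are hypothetical, chosen for this example.

```python
def char_masks(s):
    """Map each character to an int whose bit j is set iff s[j] is that character."""
    masks = {}
    for j, ch in enumerate(s):
        masks[ch] = masks.get(ch, 0) | (1 << j)
    return masks


def hamming_distances(text, pattern):
    """Hamming distance between pattern and every length-m window of text.

    Illustrative only: one AND plus popcount per character and alignment,
    with Python big integers playing the role of packed machine words.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    text_masks = char_masks(text)
    pattern_masks = char_masks(pattern)
    distances = []
    for i in range(n - m + 1):
        matches = 0
        for ch, p_mask in pattern_masks.items():
            t_mask = text_masks.get(ch, 0)
            # Shift so that text position i lines up with pattern position 0,
            # then count positions where both indicator vectors have a 1.
            matches += bin((t_mask >> i) & p_mask).count("1")
        distances.append(m - matches)
    return distances


def k_mismatch_positions(text, pattern, k):
    """Start positions where the pattern matches with at most k mismatches."""
    return [i for i, d in enumerate(hamming_distances(text, pattern)) if d <= k]
```

For example, `k_mismatch_positions("abracadabra", "abra", 1)` reports the two exact occurrences at positions 0 and 7, since every other window differs from the pattern in at least three positions.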

Similar resources

A Fast Algorithm for Approximate String Matching on Gene Sequences

Approximate string matching is a fundamental and challenging problem in computer science, for which a fast algorithm is highly demanded in many applications including text processing and DNA sequence analysis. In this paper, we present a fast algorithm for approximate string matching, called FAAST. It aims at solving a popular variant of the approximate string matching problem, the k-mismatch p...

Agrep — a Fast Approximate Pattern-matching Tool

Searching for a pattern in a text file is a very common operation in many applications ranging from text editors and databases to applications in molecular biology. In many instances the pattern does not appear in the text exactly. Errors in the text or in the query can result from misspelling or from experimental errors (e.g., when the text is a DNA sequence). The use of such approximate patte...

Approximate String Matching with Gaps

In this paper we consider several new versions of approximate string matching with gaps. The main characteristic of these new versions is the existence of gaps in the matching of a given pattern in a text. Algorithms are sketched for each version and their time and space complexity is stated. The specific versions of approximate string matching have various applications in computerized music an...

Fast Algorithms for Top-k Approximate String Matching

Top-k approximate querying on string collections is an important data analysis tool for many applications, and it has been exhaustively studied. However, the scale of the problem has increased dramatically because of the prevalence of the Web. In this paper, we aim to explore the efficient top-k similar string matching problem. Several efficient strategies are introduced, such as length aware a...

Journal title:
  • Eur. J. Comb.

Volume 34  Issue

Pages  -

Publication date 2009